Speaker Modeling from Selected Nei Recognitio
Abstract
This paper addresses the estimation of a speaker GMM through the selection and merging of a set of neighbor models for that speaker. The neighbor models are selected according to the likelihood score of the training data on a set of potential neighbor GMMs. Once selected, the neighbor models are merged to give a model of the speaker, which can also be used as an a priori model for an adaptation phase. Experiments show that merging neighbor models captures significant information about the speaker but does not yield a significant improvement over the classical UBM-adapted GMM.
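The select-then-merge idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the neighbors are merged or how many are kept, so the choices here are assumptions: diagonal-covariance GMMs stored as `(weights, means, variances)` tuples, ranking by average per-frame log-likelihood, and merging by concatenating all components with weights rescaled by the number of neighbors. The function names `gmm_avg_loglik`, `select_neighbors`, and `merge_gmms` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of X (T, D) under a diagonal GMM."""
    diff = X[:, None, :] - means[None, :, :]                         # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, K)
    log_joint = log_comp + np.log(weights)                           # (T, K)
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    m = log_joint.max(axis=1, keepdims=True)
    frame_ll = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    return float(frame_ll.mean())


def select_neighbors(X, candidates, n_best):
    """Indices of the n_best candidate GMMs that score X highest."""
    scores = [gmm_avg_loglik(X, *gmm) for gmm in candidates]
    order = np.argsort(scores)[::-1][:n_best]
    return [int(i) for i in order]


def merge_gmms(gmms):
    """Assumed merge rule: pool all components, giving each neighbor equal mass."""
    n = len(gmms)
    weights = np.concatenate([w / n for w, _, _ in gmms])
    means = np.vstack([mu for _, mu, _ in gmms])
    variances = np.vstack([var for _, _, var in gmms])
    return weights, means, variances


if __name__ == "__main__":
    # Toy data: three single-component candidate models in 2 dimensions.
    def single(center):
        return (np.array([1.0]), np.array([center], dtype=float), np.ones((1, 2)))

    candidates = [single([0.0, 0.0]), single([0.5, 0.5]), single([10.0, 10.0])]
    X = np.array([[0.0, 0.0], [0.1, -0.1], [-0.2, 0.2]])  # stand-in training frames

    chosen = select_neighbors(X, candidates, n_best=2)
    merged = merge_gmms([candidates[i] for i in chosen])
    print("selected neighbors:", chosen)
    print("merged weights:", merged[0])
```

The merged model is itself a valid GMM (its weights sum to one), so under these assumptions it could serve directly as a speaker model or as the prior for a later adaptation step, as the abstract suggests.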
Similar resources
NEIMiner: nanomaterial environmental impact data miner
As more engineered nanomaterials (eNM) are developed for a wide range of applications, it is crucial to minimize any unintended environmental impacts resulting from the application of eNM. To realize this vision, industry and policymakers must base risk management decisions on sound scientific information about the environmental fate of eNM, their availability to receptor organisms (e.g., uptake)...
Acoustic hole filling for sparse enrollment data using a cohort universal corpus for speaker recognition.
In this study, the problem of sparse enrollment data for in-set versus out-of-set speaker recognition is addressed. The challenge here is that both the training speaker data (5 s) and the test material (2~6 s) are of limited duration. The limited enrollment data result in a sparse acoustic model space for the desired speaker model. The focus of this study is on filling these acoustic holes by h...
Data Selection with Kurtosis and Nasality Features for Speaker Recognition
We propose new data selection approaches based on speaker discriminability features, including kurtosis and a set of nasality features which exploit spectral properties of nasal speech sounds. Data selected based on the speaker discriminability features are used to implement end-to-end speaker recognition systems, which produce significant improvements when combined with the baseline system (wh...
Automatic prosodic modeling for speaker and task adaptation in text-to-speech
One of the most important demands for future TTS systems is their ability to improve naturalness when embedded in a particular task or application that requires a particular speaking style for a particular speaker. In this paper, we present a new prosodic modeling procedure for improving naturalness by adapting a TTS system to a new speaker and a new speaking style. The proposed procedure is an...
Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation
In this paper, we introduce an ensemble speaker modeling using a speaker adaptive training (SAT) deep neural network (SAT-DNN). We first train a speaker-independent DNN (SIDNN) acoustic model as a universal speaker model (USM). Based on the USM, a SAT-DNN is used to obtain a set of speaker-dependent models by assuming that all other layers except one speaker-dependent (SD) layer are shared amon...
Publication date: 2003